<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML
><HEAD
><TITLE
>XML Type</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.79"><LINK
REV="MADE"
HREF="mailto:pgsql-docs@postgresql.org"><LINK
REL="HOME"
TITLE="PostgreSQL 9.1.2 Documentation"
HREF="index.html"><LINK
REL="UP"
TITLE="Data Types"
HREF="datatype.html"><LINK
REL="PREVIOUS"
TITLE="UUID Type"
HREF="datatype-uuid.html"><LINK
REL="NEXT"
TITLE="Arrays"
HREF="arrays.html"><LINK
REL="STYLESHEET"
TYPE="text/css"
HREF="stylesheet.css"><META
HTTP-EQUIV="Content-Type"
CONTENT="text/html; charset=ISO-8859-1"><META
NAME="creation"
CONTENT="2011-12-01T22:07:59"></HEAD
><BODY
CLASS="SECT1"
><DIV
CLASS="NAVHEADER"
><TABLE
SUMMARY="Header navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TH
COLSPAN="5"
ALIGN="center"
VALIGN="bottom"
><A
HREF="index.html"
>PostgreSQL 9.1.2 Documentation</A
></TH
></TR
><TR
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="top"
><A
TITLE="UUID Type"
HREF="datatype-uuid.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="top"
><A
HREF="datatype.html"
ACCESSKEY="U"
>Up</A
></TD
><TD
WIDTH="60%"
ALIGN="center"
VALIGN="bottom"
>Chapter 8. Data Types</TD
><TD
WIDTH="20%"
ALIGN="right"
VALIGN="top"
><A
TITLE="Arrays"
HREF="arrays.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
></TABLE
><HR
ALIGN="LEFT"
WIDTH="100%"></DIV
><DIV
CLASS="SECT1"
><H1
CLASS="SECT1"
><A
NAME="DATATYPE-XML"
>8.13. <ACRONYM
CLASS="ACRONYM"
>XML</ACRONYM
> Type</A
></H1
><P
>    The <TT
CLASS="TYPE"
>xml</TT
> data type can be used to store XML data.  Its
    advantage over storing XML data in a <TT
CLASS="TYPE"
>text</TT
> field is that it
    checks the input values for well-formedness, and there are support
    functions to perform type-safe operations on it; see <A
HREF="functions-xml.html"
>Section 9.14</A
>.  Use of this data type requires the
    installation to have been built with <TT
CLASS="COMMAND"
>configure
    --with-libxml</TT
>.
   </P
><P
>    The <TT
CLASS="TYPE"
>xml</TT
> type can store well-formed
    <SPAN
CLASS="QUOTE"
>"documents"</SPAN
>, as defined by the XML standard, as well
    as <SPAN
CLASS="QUOTE"
>"content"</SPAN
> fragments, which are defined by the
    production <TT
CLASS="LITERAL"
>XMLDecl? content</TT
> in the XML
    standard.  Roughly, this means that content fragments can have
    more than one top-level element or character node.  The expression
    <TT
CLASS="LITERAL"
><TT
CLASS="REPLACEABLE"
><I
>xmlvalue</I
></TT
> IS DOCUMENT</TT
>
    can be used to evaluate whether a particular <TT
CLASS="TYPE"
>xml</TT
>
    value is a full document or only a content fragment.
   </P
><DIV
CLASS="SECT2"
><H2
CLASS="SECT2"
><A
NAME="AEN6930"
>8.13.1. Creating XML Values</A
></H2
><P
>    To produce a value of type <TT
CLASS="TYPE"
>xml</TT
> from character data,
    use the function
    <CODE
CLASS="FUNCTION"
>xmlparse</CODE
>:
</P><PRE
CLASS="SYNOPSIS"
>XMLPARSE ( { DOCUMENT | CONTENT } <TT
CLASS="REPLACEABLE"
><I
>value</I
></TT
>)</PRE
><P>
    Examples:
</P><PRE
CLASS="PROGRAMLISTING"
>XMLPARSE (DOCUMENT '&#60;?xml version="1.0"?&#62;&#60;book&#62;&#60;title&#62;Manual&#60;/title&#62;&#60;chapter&#62;...&#60;/chapter&#62;&#60;/book&#62;')
XMLPARSE (CONTENT 'abc&#60;foo&#62;bar&#60;/foo&#62;&#60;bar&#62;foo&#60;/bar&#62;')</PRE
><P>
    While this is the only way to convert character strings into XML
    values according to the SQL standard, the PostgreSQL-specific
    syntaxes:
</P><PRE
CLASS="PROGRAMLISTING"
>xml '&#60;foo&#62;bar&#60;/foo&#62;'
'&#60;foo&#62;bar&#60;/foo&#62;'::xml</PRE
><P>
    can also be used.
   </P
><P
>    The <TT
CLASS="TYPE"
>xml</TT
> type does not validate input values
    against a document type declaration
    (DTD),
    even when the input value specifies a DTD.
    There is also currently no built-in support for validating against
    other XML schema languages such as XML Schema.
   </P
><P
>    The inverse operation, producing a character string value from
    <TT
CLASS="TYPE"
>xml</TT
>, uses the function
    <CODE
CLASS="FUNCTION"
>xmlserialize</CODE
>:
</P><PRE
CLASS="SYNOPSIS"
>XMLSERIALIZE ( { DOCUMENT | CONTENT } <TT
CLASS="REPLACEABLE"
><I
>value</I
></TT
> AS <TT
CLASS="REPLACEABLE"
><I
>type</I
></TT
> )</PRE
><P>
    <TT
CLASS="REPLACEABLE"
><I
>type</I
></TT
> can be
    <TT
CLASS="TYPE"
>character</TT
>, <TT
CLASS="TYPE"
>character varying</TT
>, or
    <TT
CLASS="TYPE"
>text</TT
> (or an alias for one of those).  Again, according
    to the SQL standard, this is the only way to convert between type
    <TT
CLASS="TYPE"
>xml</TT
> and character types, but PostgreSQL also allows
    you to simply cast the value.
   </P
><P
>    When a character string value is cast to or from type
    <TT
CLASS="TYPE"
>xml</TT
> without going through <TT
CLASS="TYPE"
>XMLPARSE</TT
> or
    <TT
CLASS="TYPE"
>XMLSERIALIZE</TT
>, respectively, the choice of
    <TT
CLASS="LITERAL"
>DOCUMENT</TT
> versus <TT
CLASS="LITERAL"
>CONTENT</TT
> is
    determined by the <SPAN
CLASS="QUOTE"
>"XML option"</SPAN
>
    
    session configuration parameter, which can be set using the
    standard command:
</P><PRE
CLASS="SYNOPSIS"
>SET XML OPTION { DOCUMENT | CONTENT };</PRE
><P>
    or the more PostgreSQL-like syntax
</P><PRE
CLASS="SYNOPSIS"
>SET xmloption TO { DOCUMENT | CONTENT };</PRE
><P>
    The default is <TT
CLASS="LITERAL"
>CONTENT</TT
>, so all forms of XML
    data are allowed.
   </P
><DIV
CLASS="NOTE"
><BLOCKQUOTE
CLASS="NOTE"
><P
><B
>Note: </B
>     With the default XML option setting, you cannot directly cast
     character strings to type <TT
CLASS="TYPE"
>xml</TT
> if they contain a
     document type declaration, because the definition of XML content
     fragment does not accept them.  If you need to do that, either
     use <TT
CLASS="LITERAL"
>XMLPARSE</TT
> or change the XML option.
    </P
></BLOCKQUOTE
></DIV
></DIV
><DIV
CLASS="SECT2"
><H2
CLASS="SECT2"
><A
NAME="AEN6974"
>8.13.2. Encoding Handling</A
></H2
><P
>    Care must be taken when dealing with multiple character encodings
    on the client, server, and in the XML data passed through them.
    When using the text mode to pass queries to the server and query
    results to the client (which is the normal mode), PostgreSQL
    converts all character data passed between the client and the
    server and vice versa to the character encoding of the respective
    end; see <A
HREF="multibyte.html"
>Section 22.3</A
>.  This includes string
    representations of XML values, such as in the above examples.
    This would ordinarily mean that encoding declarations contained in
    XML data can become invalid as the character data is converted
    to other encodings while travelling between client and server,
    because the embedded encoding declaration is not changed.  To cope
    with this behavior, encoding declarations contained in
    character strings presented for input to the <TT
CLASS="TYPE"
>xml</TT
> type
    are <SPAN
CLASS="emphasis"
><I
CLASS="EMPHASIS"
>ignored</I
></SPAN
>, and content is assumed
    to be in the current server encoding.  Consequently, for correct
    processing, character strings of XML data must be sent
    from the client in the current client encoding.  It is the
    responsibility of the client to either convert documents to the
    current client encoding before sending them to the server, or to
    adjust the client encoding appropriately.  On output, values of
    type <TT
CLASS="TYPE"
>xml</TT
> will not have an encoding declaration, and
    clients should assume all data is in the current client
    encoding.
   </P
><P
>    When using binary mode to pass query parameters to the server
    and query results back to the client, no character set conversion
    is performed, so the situation is different.  In this case, an
    encoding declaration in the XML data will be observed, and if it
    is absent, the data will be assumed to be in UTF-8 (as required by
    the XML standard; note that PostgreSQL does not support UTF-16).
    On output, data will have an encoding declaration
    specifying the client encoding, unless the client encoding is
    UTF-8, in which case it will be omitted.
   </P
><P
>    Needless to say, processing XML data with PostgreSQL will be less
    error-prone and more efficient if the XML data encoding, client encoding,
    and server encoding are the same.  Since XML data is internally
    processed in UTF-8, computations will be most efficient if the
    server encoding is also UTF-8.
   </P
><DIV
CLASS="CAUTION"
><P
></P
><TABLE
CLASS="CAUTION"
BORDER="1"
WIDTH="100%"
><TR
><TD
ALIGN="CENTER"
><B
>Caution</B
></TD
></TR
><TR
><TD
ALIGN="LEFT"
><P
>     Some XML-related functions may not work at all on non-ASCII data
     when the server encoding is not UTF-8.  This is known to be an
     issue for <CODE
CLASS="FUNCTION"
>xpath()</CODE
> in particular.
    </P
></TD
></TR
></TABLE
></DIV
></DIV
><DIV
CLASS="SECT2"
><H2
CLASS="SECT2"
><A
NAME="AEN6986"
>8.13.3. Accessing XML Values</A
></H2
><P
>    The <TT
CLASS="TYPE"
>xml</TT
> data type is unusual in that it does not
    provide any comparison operators.  This is because there is no
    well-defined and universally useful comparison algorithm for XML
    data.  One consequence of this is that you cannot retrieve rows by
    comparing an <TT
CLASS="TYPE"
>xml</TT
> column against a search value.  XML
    values should therefore typically be accompanied by a separate key
    field such as an ID.  An alternative solution for comparing XML
    values is to convert them to character strings first, but note
    that character string comparison has little to do with a useful
    XML comparison method.
   </P
><P
>    Since there are no comparison operators for the <TT
CLASS="TYPE"
>xml</TT
>
    data type, it is not possible to create an index directly on a
    column of this type.  If speedy searches in XML data are desired,
    possible workarounds include casting the expression to a
    character string type and indexing that, or indexing an XPath
    expression.  Of course, the actual query would have to be adjusted
    to search by the indexed expression.
   </P
><P
>    The text-search functionality in PostgreSQL can also be used to speed
    up full-document searches of XML data.  The necessary
    preprocessing support is, however, not yet available in the PostgreSQL
    distribution.
   </P
></DIV
></DIV
><DIV
CLASS="NAVFOOTER"
><HR
ALIGN="LEFT"
WIDTH="100%"><TABLE
SUMMARY="Footer navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
><A
HREF="datatype-uuid.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="index.html"
ACCESSKEY="H"
>Home</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
><A
HREF="arrays.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
><ACRONYM
CLASS="ACRONYM"
>UUID</ACRONYM
> Type</TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="datatype.html"
ACCESSKEY="U"
>Up</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
>Arrays</TD
></TR
></TABLE
></DIV
></BODY
></HTML
>